
    Dynamical Modeling of Cloud Applications for Runtime Performance Management

    Cloud computing has quickly grown to become an essential component in many modern-day software applications. It allows consumers, such as a provider of some web service, to quickly and on demand obtain the necessary computational resources to run their applications. It is desirable for these service providers to keep the running cost of their cloud application low while adhering to various performance constraints. This is made difficult by the dynamics imposed by, e.g., resource contention or a changing user arrival rate, and by the fact that there exist multiple ways of influencing the performance of a running cloud application. To facilitate decision making in this environment, performance models can be introduced that relate the workload and different actions to important performance metrics.

    In this thesis, such performance models of cloud applications are studied. In particular, we focus on modeling using queueing theory and on the fluid model for approximating the often intractable dynamics of the queue lengths. First, existing results on how the fluid model can be obtained from the mean-field approximation of a closed queueing network are simplified and extended to allow for mixed networks. The queues are allowed to follow the processor sharing or delay disciplines, and can have multiple classes with phase-type service times. An improvement to this fluid model is then presented to increase accuracy when the system size, i.e., the number of servers, initial population, and arrival rate, is small. Furthermore, a closed-form approximation of the response time CDF is presented. The methods are tested in a series of simulation experiments and shown to be accurate.

    This mean-field fluid model is then used to derive a general fluid model for microservices with interservice delays. The model is shown to be completely extractable at runtime in a distributed fashion. It is further evaluated on a simple microservice application and found to accurately predict important performance metrics in most cases. Furthermore, a method is devised to reduce the cost of a running application by tuning load balancing parameters between replicas. The method is built on gradient stepping by applying automatic differentiation to the fluid model. This allows for arbitrarily defined cost functions and constraints, most notably including different response time percentiles. The method is tested on a simple application distributed over multiple computing clusters and is shown to reduce costs while adhering to percentile constraints.

    Finally, modeling of request cloning is studied using the novel concept of synchronized service. This allows certain forms of cloning over servers, each modeled with a single queue, to be equivalently expressed as one single queue. The concept is very general regarding the involved queueing disciplines and distributions, but instead introduces new, less realistic assumptions. How the equivalent queue model is affected by relaxing these assumptions is studied for the processor sharing discipline, and an extension to enable modeling of speculative execution is made. In a simulation campaign, it is shown that relaxing these assumptions has only a minor effect in certain cases.
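
    To make the fluid-model idea concrete, the following sketch approximates the mean queue lengths of a two-queue tandem of processor sharing queues with a pair of ordinary differential equations. It is a minimal illustration with assumed rates, not a model taken from the thesis:

    ```python
    from scipy.integrate import solve_ivp

    LAM = 0.8           # external arrival rate (assumed)
    MU = [1.0, 1.2]     # service rates of the two PS queues (assumed)

    def fluid_rhs(t, x):
        # A processor sharing queue works at full rate when backlogged;
        # min(x, 1) caps the effort, giving the classic fluid approximation.
        d1 = MU[0] * min(x[0], 1.0)    # departure rate of queue 1
        d2 = MU[1] * min(x[1], 1.0)    # departure rate of queue 2
        return [LAM - d1, d1 - d2]     # d/dt of the mean queue lengths

    sol = solve_ivp(fluid_rhs, (0.0, 50.0), [5.0, 0.0])
    print(sol.y[:, -1])                # queue lengths near steady state
    ```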

    Extending Microservice Model Validity using Universal Differential Equations

    When creating models of a system, there is always a tradeoff between the ease of modelling a part and the increased value it brings to the model. If we instead learn a model using machine learning, we can capture all kinds of behaviour we do not necessarily understand, but we may need large amounts of data to learn even the things we find simple. Using universal differential equations we can combine the two, taking scientific models and embedding machine learning into them, with the goal of giving us the best of both worlds. In this paper, we extend existing models with additional unmodelled dynamics using neural networks. We present results from a specific use-case involving a microservice fluid model, where the learned extension widens the range of parameters over which the model can produce predictions. This allows the control parameters to be optimized to convergence in a single step, instead of over multiple steps with new data collected between iterations.
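
    As a rough sketch of the universal-differential-equation idea (all sizes, rates, and the stand-in training data below are assumed; this is not the paper's model), a known fluid term is combined with a small neural network whose parameters are trained by differentiating through the ODE solver:

    ```python
    import jax
    import jax.numpy as jnp
    from jax.experimental.ode import odeint

    def init_params(key, hidden=8):
        k1, k2 = jax.random.split(key)
        return (0.1 * jax.random.normal(k1, (hidden, 1)), jnp.zeros(hidden),
                0.1 * jax.random.normal(k2, (1, hidden)), jnp.zeros(1))

    def nn(params, x):                            # small learned correction
        w1, b1, w2, b2 = params
        return w2 @ jnp.tanh(w1 @ x + b1) + b2

    def rhs(x, t, params):
        known = 0.8 - 1.0 * jnp.minimum(x, 1.0)   # textbook fluid term
        return known + nn(params, x)              # universal differential equation

    def loss(params, ts, data):
        traj = odeint(rhs, jnp.array([5.0]), ts, params)
        return jnp.mean((traj - data) ** 2)

    params = init_params(jax.random.PRNGKey(0))
    ts = jnp.linspace(0.0, 50.0, 100)
    # Stand-in for measured queue lengths; real data would come from traces.
    data = odeint(rhs, jnp.array([5.0]), ts, init_params(jax.random.PRNGKey(1)))
    for _ in range(200):                          # plain gradient descent
        grads = jax.grad(loss)(params, ts, data)
        params = jax.tree_util.tree_map(lambda p, g: p - 1e-2 * g, params, grads)
    ```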

    Distributed online extraction of a fluid model for microservice applications using local tracing data

    Dynamic resource management is a difficult problem in modern microservice applications. Many proposed methods rely on the availability of an analytical performance model, often based on queueing theory. Such models can always be hand-crafted, but this takes time and requires expert knowledge. Various methods have been proposed that can automatically extract models from logs or tracing data. However, they are often intricate, requiring off-line stages and advanced algorithms for retrieving the service-time distributions. Furthermore, the resulting models can be complex and unsuitable for online evaluation. Aiming for simplicity, in this paper we introduce a general queueing network model for microservice applications that can be (i) quickly and accurately solved using a refined mean-field fluid model and (ii) completely extracted at runtime in a distributed fashion from common local tracing data at each service. The fit of the model and its predictions under system perturbations are evaluated in a cloud-based microservice application and are found to be accurate.
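
    The kind of local extraction described can be sketched as follows; the span fields and windowing are hypothetical, as the paper's tracing format is not reproduced here. Each service needs only its own spans to estimate the arrival rate and service rate that parameterize its queue in the fluid model:

    ```python
    from dataclasses import dataclass

    @dataclass
    class Span:
        arrival: float     # when the request entered this service
        departure: float   # when the response left this service
        busy: float        # local processing time, excluding downstream waits

    def local_fluid_parameters(spans, window_length):
        """Estimate (arrival rate, service rate) from one service's own spans."""
        n = len(spans)
        lam = n / window_length                           # requests per second
        mean_service = sum(s.busy for s in spans) / n     # mean service demand
        return lam, 1.0 / mean_service

    # Each service reports its local (lam, mu) pair; together these parameterize
    # the full queueing-network fluid model without any central off-line stage.
    ```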

    Automatic Differentiation over Fluid Models for Holistic Load Balancing

    Microservice applications consist of a set of smaller services interacting in a graph structure to deliver the full application. Jobs traverse this graph along different paths, depending both on the type of job and on the current load of the different service replicas. Different paths incur different scenario-specific costs, dependent on, e.g., the deployment and the underlying cloud system. In this paper, we demonstrate how automatic differentiation over data-driven fluid models can be used to optimize a running microservice application, by designing a load balancer that minimizes some holistic cost function under response time percentile constraints. The cost function is based on performance metrics from a fluid model retrieved through logs from the application. The gradient of this cost, with respect to the load balancing parameters, is calculated via automatic differentiation. This enables parameter updates, using, e.g., gradient descent, that steer the application towards a setting of lower cost. In an experimental evaluation on a small microservice application running in the Ericsson Research Datacenter, it is shown that the method can quickly step towards optimal values while supporting complicated cost functions such as solutions to a system of ordinary differential equations.
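
    The core loop might look like the following sketch, with a toy cost standing in for the percentile-constrained one and with assumed rates and horizons: the fluid model is solved inside the cost function, and automatic differentiation (here via JAX) provides the gradient with respect to the load balancing parameters:

    ```python
    import jax
    import jax.numpy as jnp
    from jax.experimental.ode import odeint

    MU = jnp.array([1.0, 2.0, 4.0])     # replica service rates (assumed)
    LAM = 3.0                           # total arrival rate (assumed)

    def rhs(x, t, w):
        return LAM * w - MU * jnp.minimum(x, 1.0)   # fluid model per replica

    def cost(logits):
        w = jax.nn.softmax(logits)      # keeps the weights on the simplex
        ts = jnp.linspace(0.0, 50.0, 200)
        x = odeint(rhs, jnp.zeros(3), ts, w)
        return jnp.sum(x[-1])           # toy holistic cost: total final backlog

    logits = jnp.zeros(3)
    for _ in range(100):                # gradient stepping via autodiff
        logits = logits - 0.1 * jax.grad(cost)(logits)
    print(jax.nn.softmax(logits))       # load balancing weights of lower cost
    ```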

    On Innovation-Based Triggering for Event-Based Nonlinear State Estimation Using the Particle Filter

    Event-based sampling has been proposed as a general technique for lowering the average communication rate, energy consumption, and computational burden in remote state estimation. However, the design of the event trigger is critical for good performance. In this paper, we study the combination of innovation-based triggering and state estimation of nonlinear dynamical systems using the particle filter. It is found that innovation-based triggering is easily incorporated into the particle filter framework, and that it vastly outperforms the classical send-on-delta scheme for certain types of nonlinear systems. We further show how the particle filter can be used to jointly precompute the future state estimates and trigger probabilities, thus eliminating the need for periodic observer-to-sensor communication, at the cost of increased computational burden at the observer. For wireless, battery-powered sensors, this enables the radio to be turned off between sampling events, which is key to saving energy.
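
    A bootstrap-particle-filter sketch of the idea follows, with a toy model and an assumed threshold rather than the paper's exact scheme; for brevity the sensor and observer are collapsed into one loop, whereas in the actual setup the trigger runs at the sensor:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N, DELTA = 1000, 1.0                            # particles, threshold (assumed)

    f = lambda x: 0.5 * x + 25 * x / (1 + x**2)     # toy nonlinear dynamics
    h = lambda x: x**2 / 20                         # toy measurement model

    x_true, particles, sent = 0.1, rng.normal(0.0, 1.0, N), 0
    for k in range(100):
        x_true = f(x_true) + rng.normal(0.0, 1.0)
        y = h(x_true) + rng.normal(0.0, 0.5)
        particles = f(particles) + rng.normal(0.0, 1.0, N)      # predict
        innovation = y - h(particles).mean()                    # vs. prediction
        if abs(innovation) > DELTA:                             # innovation trigger
            sent += 1
            w = np.exp(-0.5 * ((y - h(particles)) / 0.5) ** 2)  # likelihoods
            particles = rng.choice(particles, N, p=w / w.sum()) # resample
    print(f"transmitted {sent}/100 samples, estimate {particles.mean():.2f}")
    ```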

    Internal Server State Estimation Using Event-based Particle Filtering

    Closed-loop control of cloud resources requires measurements to be readily available from the process in order for the feedback mechanism to form a control law. With state-feedback control, the sought states might be infeasible or impossible to measure in real applications; instead, they must be estimated. However, running the estimators in real time for all measurements requires considerable computational overhead. Further, if the observer and process are disjoint, sending all measurements puts extra strain on the network.

    In this work-in-progress paper, we propose an event-based particle filter approach to capture the internal dynamics of a server with CPU-intensive workload whilst minimizing the required computation and inter-system network strain. Preliminary results show promise: in a simulation experiment, the approach outperforms estimators derived from analytic expressions for stationary systems in service-rate estimation relative to the number of samples used. Further, we show that for the same simulation, an event-based sampling strategy outperforms periodic sampling.
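
    A send-on-delta trigger, one simple event-based sampling strategy, fits in a few lines (the threshold and usage trace below are assumed); the observer's particle filter then updates only on the samples it yields:

    ```python
    def send_on_delta(measurements, delta):
        """Yield only the samples a send-on-delta sensor would transmit."""
        last = None
        for k, y in enumerate(measurements):
            if last is None or abs(y - last) > delta:
                last = y
                yield k, y    # the observer's particle filter updates on these

    # Hypothetical usage: events = list(send_on_delta(cpu_trace, delta=0.05))
    ```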

    Classification of Prognosis in Breast Cancer Patients from AMLC Analysis using Machine Learning Techniques

    Predicting the development of distant metastasis for breast cancer patients is of high importance for both the patient and the medical staff. The current best method for prediction is the use of handcrafted histopathological features. The aim of this study is to explore how well Additive Multiple Labelling Cytochemistry (AMLC) stained core biopsy images can predict the development of distant metastasis. For this, two cohorts totalling 488 patients are investigated, each patient having 1-4 images of AMLC-stained core biopsies (of size 2 mm) from the tumour area and AMLC features extracted from those images. Each patient is also supplied with the handcrafted histopathological features for reference. Both the images and the numerical AMLC features extracted from them contain information on the immunological response of the patient, which in turn has been shown to have good predictive potential. The images were analyzed using convolutional neural networks, and the AMLC features with support vector machine, random forest, and linear discriminant analysis classifiers. We show that the convolutional neural networks and the numerical classifiers achieve similar performance, with an area under the receiver operating characteristic curve (AUC) of approximately 0.60, which is worse than the result achieved on the histopathological features, which gave an AUC of approximately 0.74. Further, we show that no difference in prediction can be observed when using AMLC images containing immunological features as compared to AMLC images not containing any immunological features. Finally, we show that an ensemble of the classifiers for the AMLC features and the images gives no significant boost to performance in terms of AUC. The results indicate that the sample size of 488 patients was too small to capture the variance present between patients. Furthermore, the core biopsies were most likely too small to capture the behaviour of the tumour. For future studies, it is recommended to increase the number of patients and the size of the core biopsies.
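
    The numerical-feature pipeline can be roughly sketched with generic scikit-learn components under assumed data shapes; this is not the study's exact configuration:

    ```python
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import cross_val_predict
    from sklearn.svm import SVC

    def evaluate(X, y):
        """X: numerical AMLC features per patient, y: metastasis labels (0/1)."""
        models = {"svm": SVC(probability=True),
                  "rf": RandomForestClassifier(),
                  "lda": LinearDiscriminantAnalysis()}
        probs = []
        for name, model in models.items():
            p = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
            probs.append(p)
            print(name, roc_auc_score(y, p))        # per-classifier AUC
        print("ensemble", roc_auc_score(y, np.mean(probs, axis=0)))  # soft vote
    ```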

    Latency prediction in 5G for control with deadtime compensation

    With the promise of increased responsiveness and robustness of the emerging 5G technology, it is suddenly becoming feasible to deploy latency-sensitive control systems over the cloud via a mobile network. Even though 5G is heralded to give lower latency and jitter than current mobile networks, the effect of the delay would still be non-negligible for certain applications.

    In this paper we explore and demonstrate the possibility of compensating for the unknown and time-varying latency introduced by a 5G mobile network when controlling a latency-sensitive plant. We show that the latency from a prototype 5G test bed lacks significant short-term correlation, making accurate latency prediction a difficult task. Further, because of the unknown and time-varying latency, the simple interpolation-based model we use exhibits some troubling theoretical properties, limiting its usability in real-world environments. Despite this, we demonstrate the strategy and show that it appears to increase robustness for a simulated plant.
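
    The compensation strategy can be sketched as follows, with a hypothetical linear plant model and a naive last-value latency predictor standing in for the paper's interpolation-based model:

    ```python
    import numpy as np

    def predict_latency(latency_history):
        # The measured 5G latency lacks short-term correlation, so a naive
        # last-value prediction is hard to beat; this is only a placeholder
        # for the interpolation-based model.
        return latency_history[-1]

    def compensated_state(x, A, B, in_flight_controls):
        """Roll the plant model forward over the dead time (Smith-predictor
        style) so the controller acts on a delay-compensated state."""
        for u in in_flight_controls:     # controls sent but not yet applied
            x = A @ x + B @ u
        return x

    # Hypothetical usage: steps = round(predict_latency(history) / sample_time),
    # then feed compensated_state(...) to the state-feedback controller.
    ```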

    Improving the Mean-Field Fluid Model of Processor Sharing Queueing Networks for Dynamic Performance Models in Cloud Computing

    Resource management in cloud computing is a difficult problem, as one is often tasked with balancing adequate service to clients against cost minimization in dynamic environments of many interconnected components. To make correct decisions in these environments, good performance models are necessary. A common modeling methodology is to use networks of queues, but as these are prohibitively expensive to evaluate for many real-time applications, different approximation methods for important metrics are frequently employed. One such method, providing both transient solutions and short, scalable computation times, is the fluid model, which approximates the dynamics of the mean queue lengths using a system of ordinary differential equations. However, finding a fluid model that can adequately approximate an arbitrary queueing network is in general difficult. In this paper, we extend the state of the art with the following three contributions. First, we show that for any mixed multiclass queueing network of processor sharing and delay queues with phase-type service time distributions, such a fluid model can be found via the mean-field approximation. Second, we propose an improved model, based on smoothing the processor share function, that increases accuracy for certain systems. Finally, using the smoothed mean-field model, we introduce an accurate closed-form approximation of the response time CDF over any subset of classes and queues. The contributions are evaluated in a large simulation experiment, which shows that they can be used to accurately predict performance metrics under some system perturbations common in cloud computing.
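
    To illustrate the kind of smoothing involved (a generic soft minimum for positive arguments; the exact smoothing used in the paper may differ), the kinked effort function min(x, c) of the fluid model is replaced by a differentiable approximation:

    ```python
    import numpy as np

    def effort_hard(x, c=1.0):
        return np.minimum(x, c)    # classic fluid PS effort, kinked at x = c

    def effort_smooth(x, c=1.0, p=10.0):
        # p-norm soft minimum: smooth for x > 0 and approaching min(x, c) as
        # p grows, removing the kink that hurts accuracy at small system sizes.
        return (x**-p + c**-p) ** (-1.0 / p)
    ```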

    Towards Performance Modeling of Speculative Execution for Cloud Applications

    Interesting approaches to counteracting performance variability within cloud datacenters include sending multiple request clones, either immediately or after a specified waiting time. In this paper, we present a performance model of cloud applications that utilize the latter concept, known as speculative execution. We study the popular Join-Shortest-Queue load-balancing strategy under the processor sharing queueing discipline. Utilizing the near-synchronized service property of this setting, we model speculative execution using a simplified synchronized service model. Our model is approximate, but accurate enough to be useful even for high-utilization scenarios. Furthermore, the model is valid for any, possibly empirical, inter-arrival and service time distributions. We present preliminary simulation results, showing the promise of our proposed model.
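
    The synchronized-service simplification admits a compact Monte Carlo check, sketched below with an assumed sampler and waiting time: with one speculative clone launched after a waiting time w, a job's effective service time becomes min(S1, w + S2).

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def effective_service_times(sample, n=100_000, wait=0.5):
        # Under synchronized service the original request and its clone finish
        # together, so a job completes after min(S1, wait + S2).
        s1, s2 = sample(n), sample(n)
        return np.minimum(s1, wait + s2)

    draw = lambda n: rng.exponential(1.0, n)   # any empirical sampler also works
    print(effective_service_times(draw).mean())
    ```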